The Triangulation Method

Dan on Testing - Part 2

2350 words

v1

Originally published on eighttrigrams.substack.com on October 22nd, 2025

Wooden structure in the shape of an arch between two piers, Martin’s Creek viaduct construction, Kingsley, Pennsylvania, October (TRANSPORT 944) - [commons]. I find this a useful visual analogy to the relationship between tests (or specs) and implementation. The specs being the scaffold which determine the implementation (the arch to be built). One could theoretically discard those afterwards like the scaffold (but since they double as regression tests, we don’t do that).

In the last[1] article we spoke about, among other things, how values that get passed into functions can be arbitrarily complex. We can pass in scalar values like 4, lists like (2,3), maps like {:a 1 :b 2}, and nest these into arbitrary “structures”, like {:a (2,3)} (using Clojure-y syntax). One can easily see that the number of permissible values of a certain structure for a given function can become very large very quickly. Let’s assume we are allowed to pass lists of exactly three digits (the form being (d,d,d)), we get 10 x 10 x 10 = 1000 possible values we can pass. But of course we could allow passing in ASCII characters as well, or allow lists of arbitrary length, or allow lists to be nested in whichever way we need.
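To make that count concrete, here is a quick Clojure sketch (the name three-digit-lists is purely illustrative) that enumerates every list of exactly three decimal digits:

```clojure
;; all lists of the form (d,d,d) where d is a decimal digit
(def three-digit-lists
  (for [a (range 10) b (range 10) c (range 10)]
    (list a b c)))

(count three-digit-lists) ;; => 1000
```

Three independent digit positions, ten choices each, and the count multiplies out exactly as stated above.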

A system is composed of functions. The bigger the system, the more functions it contains. While functions in textbook examples and lower-level functions that usually come with the language in standard packages take simple values, or at least values of simple structure, the higher up you go in the composition stack, the more complex the values passed into system functions tend to become. As I argued last time, one can interpret things such that, for example, the state of an entire database could be seen as an input argument to a system-scope function. This complexity is hidden because programming languages are designed precisely to make programming cognitively feasible.

I mention this specifically to strengthen the intuition that it makes sense that module-level functions of mid-sized complexity tend to take structures as input arguments. Even if a function only takes 4 arguments, you will often find that you pass an object, or an accessor to the database, a config, or a complex datastructure representing a form filled in by a user, or what have you. I think these structures must be a reflection of sorts of the internal complexity of large composed functions.

In general, the fact that these input sets are huge plays an important role in the discussion further below, which will spell out what it means for testing.

In any case, the innards of a function itself can be constructed in myriad ways. They can react in any number of ways to the complex input data. Even for simple input data they can, in fact. A square function might take a number and square it, and do so for every number. But then a function can also partition the inputs into classes of inputs and form outputs after making a case distinction. For example, if the input number is less than 5, square it. If it is greater than or equal to 5, add 4 to it.
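Sketched in Clojure, such a case distinction might look like the following (the name square-or-add is mine, purely for illustration):

```clojure
;; square below the cutoff, add 4 at or above it
(defn square-or-add [x]
  (if (< x 5)
    (* x x)
    (+ x 4)))
```

So (square-or-add 3) yields 9, while (square-or-add 5) yields 9 as well, by an entirely different route.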

Depending on the level at which the programmer works, he might draw on existing functions, which are already composed of smaller functions. At some level, however, he hits bottom, which is usually basic addition or iteration implemented at the hardware level (multiplication might be implemented either in hardware or software). Multiplication vs. addition is actually an instructive example anyway. A square number calculation may be expressed as simply as (* x x), but when multiplication is not available as an atomic operation, something lower-level is needed.

(defn multiply [a b] (reduce + (repeat b a)))

This function, which for simplicity we assume takes only positive values, is implemented in terms of two other functions, one of which groups the same number a, b times. The other takes the list thus generated and adds all its elements. If we had no recourse to repeat and reduce, we would need to implement this in terms of (yet lower-level) iteration.
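A minimal sketch of such a lower-level version, written with Clojure's loop/recur instead of repeat and reduce (the name multiply-iter is hypothetical, and as above we assume non-negative integer inputs):

```clojure
;; add a to an accumulator b times, by explicit iteration
(defn multiply-iter [a b]
  (loop [n b acc 0]
    (if (zero? n)
      acc
      (recur (dec n) (+ acc a)))))
```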

In any case, we are getting slowly to the heart of the matter. When we program, at some level we know what inputs and outputs our program takes or produces, and which outputs it should produce for which inputs. These are our functional requirements. And when we are very diligent, we capture these requirements (convert them from natural language obtained through difficult conversations) as pairs of inputs and outputs - which we then call examples. Of course, requirements gathering is a laborious process and communication is complicated by the limits of language. But that need not concern us here.

The question always is—and this shall be our concern here—after the requirements have been gathered or discovered, how to bring about the desired results. That is, how to construct the system. Here we will concentrate on the functional requirements, i.e., the input-to-output mapping, rather than the so-called non-functional requirements, like speed, memory, security, or usability, because that is complex enough for an interesting discussion. But we should at least mention that these play a major role in decision-making regarding how we construct the system.

Now, the input-output mapping constrains the system to be built, but doesn’t determine it. The engineer’s task is precisely to figure out what can bring about the desired outcomes, and there are infinitely many ways. For 4 is 3 + 1, but also 3 + 1 - 1 + 1. The obvious mathematical fact is that infinite variations can produce the same result. But there’s a subtler point we should consider. Let me give another example.

Perhaps you’ve seen this textbook example of regression. You have a bunch of dots that look roughly linear, and there’s a mathematical way to fit a line through them. However, as is (hopefully) pointed out, choosing a linear model—fitting a straight line—is a choice of model family. We could equally choose polynomial functions or other function families to fit the same data points. Each choice represents a fundamentally different mechanism. The characteristics of the chosen function family are entirely up to the statistician.

A similar thing applies to the engineer and the system-function. The system can be constructed in any number of ways, each with different internal mechanisms yet producing the same input-output behavior. The reasons for variation fall into three categories.

  1. Non-functional requirements. Algorithm design is the perfect example.

  2. Unnecessary redundancies (which can be refactored away under tests, so that the code does the same with less).

  3. This is where the statistics analogy is most useful. It could be that the function produces unspecified outputs for unspecified inputs, in addition to the specified ones. This may or may not constitute a problem. It is the task of the engineer to determine which cases need to be covered such that no undesired input-output behaviour takes place. A fundamental fact of testing which the engineer has to deal with is the always limited number of test examples.[2]

Obviously, in any case, given the constraints (the specification), we want to find the sparsest implementation.

Also, from the fact that there are software / computer engineers (and not only computer scientists) we know that the implementation cannot be computed from the specification. I think I have seen at some point that this is a research subject, which probably borders on or is subsumed under ‘formal proofs.’ The details do not interest me here. But given that I write this in 2025, I feel I should at least mention in passing that there is an ongoing discussion, spurred by the new possibilities LLMs bring to the mix, of whether the code or the specs should be seen as the ‘primary’ artefact.[3] Let’s see how that discussion goes. But for the present argument it makes zero difference, for I simply analyse the relationship between specification and implementation insofar as the implementation is generated from the specification. We keep both anyway, but which one you would theoretically be willing to discard afterwards doesn’t matter here.[4]

Returning to the main topic, the sparsest way is the cheapest in some sense. And it’s the engineer’s (or, well, the “implementer’s”[5]) job to find that most economical solution. While no algorithm exists to derive it from a specification, as mentioned earlier, there are methodical ways to create it (and method is what one would think distinguishes him from laymen).

First of all, it involves specifying relevant input and output pairs. How to find those is the question, of course. For now, suffice it to say that we simply need to start with some.

Example: Square number calculation.

Input: 4; Output: 16.

Two simple implementation choices (in Clojure) are the following:

1. (case x 4 16)
2. (if (= 4 x) 16 117) 

For unspecified input values, the first implementation throws an exception, in which case we interpret our function to be restricted in its domain. In the second case, we map all other input values to an arbitrarily chosen default output value. Both implementations are fine, but of course we want to implement square numbers in general. Thus we extend the domain to include all valid numbers in the mathematical squaring operation. So we continue by specifying:

Input: 5; Output: 25.

Possible implementation:

(case x 4 16 5 25) 

We are forced to change the implementation to satisfy a second constraint.

This process of adding additional constraints to force implementation changes—I’ll never forget—was one of the biggest a-ha moments in my career. This technique is called triangulation and I learned it from the book Test-Driven Development by Kent Beck.

Note that the implementation naturally follows (is “driven”) from the constraints. And also that the implementation builds on an existing implementation.

Of course, aside from the fact that it’s impossible to implement this particular function for the entire intended domain in limited time, the intelligent thing to do is to find a general way to compute the results. In this case we can, for example, do either

(* x x)

or

(reduce + (repeat x x))

depending on what lower level functions are available to us; in fact, we can do this at any level, it’s “functions all the way down,” and we can compose whichever functions are available and handy to us, as long as the constraints as given by the input-output pairs are satisfied!
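For instance, staying with the toy example, the general squaring function can be composed from the hand-rolled multiplication above (a sketch; the function names are merely illustrative):

```clojure
;; multiplication built from repeat and reduce, as before
(defn multiply [a b] (reduce + (repeat b a)))

;; squaring composed from the lower-level multiply
(defn square [x] (multiply x x))
```

Here (square 4) gives 16 and (square 5) gives 25, satisfying both constraints at once.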

But here’s the problem. Aside from the things we mentioned earlier and wanted to revisit, the difference between building a system and our neat toy examples from mathematics is the following: In mathematical functions, it’s their very construction that is known beforehand. Whereas in engineering, it’s precisely the challenge to find the most elegant way to express what needs to be constructed!

Usually, however, when one starts with the requirements, the question is less how to construct the solution and more how to find the right input-output examples. The implementation should emerge from those; therein lies the methodical aspect. And the more pairs we specify, the more generic the implementation becomes.

The challenge lies in the selection of pairs, since we cannot specify all of them. This is what was alluded to at the start regarding the impact the size of input sets has on testing: it becomes a question of which cases actually make a difference.

Let’s get back to the example already mentioned here: “if the input number is less than 5, square it. If it is greater than or equal to 5, add 4 to it.”

  1. 3 yields 9

  2. 4 yields 16

  3. 5 yields 9

  4. 6 yields 10

So we have two distinguishable classes of outputs here, with a clear cutoff point between them, around which we need to construct our test cases. We should be able to derive what “makes a difference” from how the output is “partitioned”. Implementation-wise, this means an if-branch.
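The four examples can be written down directly as executable constraints against a candidate implementation; a sketch, where f is just a placeholder name for the function under construction:

```clojure
;; candidate implementation: branch at the cutoff point
(defn f [x]
  (if (< x 5)
    (* x x)
    (+ x 4)))

(assert (= 9  (f 3)))
(assert (= 16 (f 4)))
(assert (= 9  (f 5))) ;; first case past the cutoff
(assert (= 10 (f 6)))
```

Cases 2 and 3 sit on either side of the cutoff; dropping one of them would let a wrong implementation (say, one that branches at 4 or at 6) slip through.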

Whereas in the math example we knew the construction beforehand, ideally in a real-world setting we would “discover” the construction entirely from the requirements, broken down into examples. Practically, however, working on implementation details informs how we look at the function from the outside. There is no necessity to strictly treat function internals like a black box.[6] Also, possibly needless to say, the described technique is independent of the scale at which we apply it. That is, it works for functions of any size (of composition).

So as far as testing is concerned, we have two essential concepts at play here. Identification of relevant examples and triangulation, which is the process by which the implementation “grows” naturally, as it is required to satisfy all of the given constraints at the same time. Any additional case, as in, new requirement, forces an adjustment in the function’s internals.
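One way to picture this is to treat the examples as data and check a candidate implementation against all of them at once. A sketch (meets-spec? is my own helper here, not an established API):

```clojure
;; requirements captured as input/output pairs
(def examples [[3 9] [4 16] [5 25]])

;; an implementation satisfies the spec when it maps
;; every specified input to its required output
(defn meets-spec? [f examples]
  (every? (fn [[in out]] (= out (f in))) examples))

(meets-spec? (fn [x] (* x x)) examples)  ;; => true
(meets-spec? (fn [x] (+ x 12)) examples) ;; => false, fails on 3
```

Each new pair added to examples is a new constraint, and an implementation that passed before may now be forced to change: triangulation expressed as data.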

As simple as this is, I found it quite mind-blowing at the time. Having a name for a concept (like triangulation in this case) gives it a different sort of handle and makes it a powerful tool in one’s toolbox which can be consciously deployed.

I was amazed to learn at some point that triangulation is also a concept in psychology. There it refers to how the presence of a third person modulates the communication between two people. Triangulation originally refers to a surveying and navigation technique used to determine the location of a point by measuring angles to it from known points at either end of a fixed baseline (applying knowledge of the geometry of triangles). It serves as a useful analogy in the testing context because we can express that a third thing (the implementation) is determined from two other things, which is to say, first one requirement (like the input of 4 in the squares example) and then another (like the subsequent input of 5).

In any case, we will talk a bit more about the link between testing and implementation in a subsequent article.

Footnotes

  1. Testing Functional Requirements - here.

  2. Property-based testing doesn’t change that in the sense that it doesn’t free the engineer from thinking through classes (the possible different cases) of inputs. For that, it doesn’t matter how many different inputs of a given class you throw at a function.

  3. See for example Understanding Spec-Driven-Development: Kiro, spec-kit, and Tessl - martinfowler.com.

  4. Spec-driven development with AI: Get started with a new open source toolkit (github.blog) makes an interesting argument here, namely that specs capture intent, and this makes them suitable as a basis for complete rewrites of existing software. The key question in general is: if the spec is the primary artefact (which in some sense it naturally is), how reproducible can we make the process of generating code from spec?

  5. Could be an LLM.

  6. This is why I always tended to take the test-first implication of test“-driven” aspirationally rather than literally. It always felt easier to figure things out “along the way.” Though, interestingly, with AI-assisted coding now, I feel like I want to write specs first. The reason is, I think, that leveraging LLMs in testing makes things cognitively easier, insofar as less rigor is required where natural language substitutes for both test code and the code for constructing test harnesses.
